MSKIDS analysis

This report examines the effect of Multiple Sclerosis (MS) on brain volumes in a pediatric sample. The data used in this analysis consists of a harmonized dataset of volumetric measurements of 145 brain regions and covariates such as age, sex and MS status.

Multiple linear regression models were performed to investigate the relationship between MS and brain volume measurements. The models were run separately for each brain region of interest (ROI) and the results were reported for ROIs that showed significant associations with MS, after adjusting for multiple corrections using false discovery rate.

library(tidyverse)
library(broom)
library(shiny)

# load the data
df <- read.csv('data/harmonized_data_PNC+MS.csv', stringsAsFactors = TRUE) %>% 
  mutate(MS = as.factor(MS)) %>% 
  mutate_at(vars(starts_with('X')), ~ . *0.001) %>% # convert mm^3 to cm^3
  rename(ICV = X702,
         Sex = sex, 
         Age = age)

# capture roi indices as vector for functions
ROI_INDICES <- df %>% 
  select(starts_with("X")) %>% 
  colnames() 

# load the ROI dictionary
dict_df <- readxl::read_excel('data/MUSE_ROI_Dict.xlsx') %>% 
  filter(ROI_NAME == 'ICV' | ROI_INDEX < 300) %>% 
  select(ROI_INDEX, TISSUE_SEG, ROI_NAME) %>% 
  mutate(ROI_INDEX = paste0('X', ROI_INDEX),
         TISSUE_SEG = recode(TISSUE_SEG, `GM+WM+VN+BS+cCSF` = 'ICV')) %>% 
  filter(ROI_INDEX %in% ROI_INDICES)

Sample size

# print cell sizes and total sample size
df %>% 
  count(Sex, MS) %>% 
  bind_rows(summarize(., n = sum(n)))

Models

Multiple regression models were run to investigate the relationship between MS and brain volumes. The models were run separately for each brain region of interest (ROI) and the results were reported for ROIs that showed significant associations with MS. The models controlled for sex, age, age squared and intracranial volume (ICV). Additionally, interactions between MS and sex were also included in the model. The results of these models are reported and visualized using box plots to help interpret the findings.

The analysis shows that MS is associated with specific brain volume changes in pediatric population.

Model equation:

\[\begin{align} ROI &= \beta_0 + \beta_1*Sex + \beta_2*Age + \beta_3*Age^2 + \beta_4*Sex*Age + \beta_5*MS*Sex + \beta_6*ICV + \epsilon \end{align}\]

Model R formula:

ROI ~ Sex + Age + Age^2 + Sex*Age + MS*Sex + ICV

# No MS-sex interation Model: ROI ~ Sex + Age + Age^2 + Sex\*Age + MS + ICV
run_model <- function(outcome){
  #' run model for each ROI
  f <- as.formula(sprintf("%s ~ Sex + Age + Age^2 + Sex*Age + MS + ICV", outcome))
  res <- lm(f, data = df) 
}

# run models searately
models <- ROI_INDICES %>% 
  purrr::map(run_model) %>% 
  purrr::map(tidy) %>% 
  setNames(ROI_INDICES) %>% 
  bind_rows(.id = 'ROI_INDEX') %>%
  left_join(dict_df, by = c('ROI_INDEX')) %>% # join with ROI dictionary
  select(ROI_INDEX, ROI_NAME, TISSUE_SEG, everything())

# use FDR correction
models %>% 
  filter(term == 'MS1') %>% 
  mutate(p.value = p.adjust(p.value, method='fdr')) %>% 
  filter(p.value < 0.05)
run_model <- function(outcome){
   #' run model for one ROI
  f <- as.formula(sprintf("%s ~ Sex + Age + Age^2 + Sex*Age + MS*Sex + ICV", outcome))
  res <- lm(f, data = df)
}

# run models searately for each ROI
models <- ROI_INDICES %>% 
  purrr::map(run_model) %>% 
  purrr::map(tidy) %>% 
  setNames(ROI_INDICES) %>% 
  bind_rows(.id = 'ROI_INDEX') %>% 
  left_join(dict_df, by = c('ROI_INDEX')) %>% 
  select(ROI_INDEX, ROI_NAME, TISSUE_SEG, everything())

# select effects/terms of interest and adjust p values. keep significant effects
sig_effects <- models %>% 
  filter(term %in% c('SexMALE:MS1', 'MS1')) %>% 
  mutate(p.value.adj = round(p.adjust(p.value, method='fdr'),4)) %>% # adjust for 145*2 = 290 MCs
  filter(p.value.adj < 0.05) %>% 
  arrange(by = TISSUE_SEG)

Significant main effects: MS

With healthy controls as the reference group, the main effect (in \(cm^3\)) of having MS is shown in the estimate column for each significant ROI.

sig_effects %>% 
  filter(term == 'MS1')

Results show that across Grey Matter (GM) regions, MS patients show decreased volumes relative to controls; whereas in the \(3^{rd}\) ventricle and the left and right frontal lobes, MS patients had higher volume than controls.

Significant interaction effects

With MS females as the reference group, the interaction effect (in \(cm^3\)) of having MS and being male is shown in the estimate column for each significant ROI. These effects are independent of the main effect of sex.

sig_effects %>% 
  filter(term == 'SexMALE:MS1')

Results that show that MS males have higher volumes than females in right and left Thalamus and right inferior lateral ventricle. On the other hand MS males exhibit reduced volumes in the left frontal lobe than MS females.

Visualization

The means for the ROIs with a significant main effect (of MS status) and/or interaction (of MS status with sex) are plotted below by age and MS status.

plot_means <- function(roi, data=df) {
  #' make a box plot for a given roi
  roi_name <- dict_df %>% 
    filter(ROI_INDEX == roi) %>%
    pull(ROI_NAME)
  
  sig_effects <- sig_effects %>% 
    filter(ROI_INDEX == roi) %>% 
    pull(term) %>% 
    recode("MS1" = "main effect", "SexMALE:MS1" = "interaction")
  
  title <- ggtitle(sprintf("Significant effects of MS: %s", knitr::combine_words(sig_effects)))
  
  data %>% 
    mutate(MS=fct_recode(MS, Yes = "1", No = "0")) %>% 
    ggplot(aes_string(y=roi, x='MS', fill='Sex')) + 
    geom_boxplot() + 
    ylab(sprintf("%s (cm3)", roi_name)) +
    theme_bw() + title
    
}

## make a boxplot for ROIs with significant effects
sig_rois <- sig_effects %>% 
  pull(ROI_INDEX) %>% 
  unique()

plots <- sig_rois %>% 
  purrr::map(~ plot_means(.x), data=df) %>% 
  setNames(sig_rois)

Volumes

X59

X60

X107

X121

X122

X139

X4

X49

X81

X82

Volumes minus Age and ICV effect

# regress out nuisance effects from roi
reg_out_roi <- function(roi){
  #' run model for each ROI
  model <- lm(roi ~ Age + Age^2 + ICV, data = df) 
  model$residuals
}

reg_out_df <- df %>% 
   mutate_at(vars(starts_with('X')), reg_out_roi)
  
plots_reg_out <- sig_rois %>% 
  purrr::map(plot_means, data=reg_out_df) %>% 
  setNames(sig_rois)

X59

X60

X107

X121

X122

X139

X4

X49

X81

X82